14 research outputs found
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
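As a rough illustration of one of the baseline methods named in the abstract, the following is a minimal sketch of Okapi BM25 scoring over pre-tokenized documents. The function name `bm25_scores`, the parameter defaults `k1=1.5` and `b=0.75`, and the smoothed IDF variant are assumptions chosen for illustration; the abstract does not state which BM25 variant or settings the consortium used.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    n_docs = len(docs_tokens)
    avg_len = sum(len(doc) for doc in docs_tokens) / n_docs

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for doc in docs_tokens:
        df.update(set(doc))

    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus: each document is a list of lowercase tokens.
docs = [
    ["protein", "stability"],
    ["cattle", "breed"],
    ["protein", "protein", "fold"],
]
scores = bm25_scores(["protein"], docs)
```

Ranking candidate articles by such a score against the tokens of a seed article's title or abstract gives one recommendation list; a hybrid method of the kind the abstract suggests would combine lists from several scorers.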
Predicting virulence factors of immunological interest
This article does not have an abstract
Intrinsic contributions of polar amino acid residues toward thermal stability of an ABC–ATPase of mesophilic origin
The nucleotide-binding subunit of the phosphate-specific transporter (PstB) from the mesophilic bacterium Mycobacterium tuberculosis is a unique ATP-binding cassette (ABC) ATPase because of its unusual ability to hydrolyze ATP at high temperature. In an attempt to define the basis of this thermostability, we took a theoretical approach and compared the amino acid composition of this protein with that of other PstBs from available bacterial genomes. Interestingly, based on the content of polar amino acids, this protein clustered with the thermophiles.
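The composition comparison described above can be sketched as a simple per-sequence statistic. The residue set `POLAR` below (uncharged polar plus charged residues, in one-letter codes) and the function name `polar_fraction` are assumptions for illustration only; the abstract does not say which residues the authors counted as polar or how they clustered the proteins.

```python
# Assumed definition of "polar": uncharged polar residues (S, T, N, Q, Y, C)
# plus charged residues (D, E, K, R, H). Other definitions exist.
POLAR = set("STNQYCDEKRH")


def polar_fraction(sequence):
    """Return the fraction of residues in a protein sequence that are polar."""
    residues = [aa for aa in sequence.upper() if aa.isalpha()]
    if not residues:
        return 0.0
    return sum(aa in POLAR for aa in residues) / len(residues)


# Hypothetical four-residue peptide: two polar (S, T) and two non-polar (A, A).
example = polar_fraction("STAA")
```

Computing this fraction for each PstB sequence yields one number per organism, which could then be compared between mesophiles and thermophiles.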
In Silico Analysis of Gene Expression Change Associated with Copy Number of Enhancers in Pancreatic Adenocarcinoma
Understanding the gene regulatory network governing cancer initiation and progression is necessary, yet this network remains largely unexplored. Enhancer elements represent the center of this regulatory circuit. The study aims to identify the gene expression change driven by copy number variation in enhancer elements of pancreatic adenocarcinoma (PAAD). The pancreatic tissue-specific enhancer and target gene data were taken from EnhancerAtlas. The gene expression and copy number data were taken from The Cancer Genome Atlas (TCGA). Differentially expressed genes (DEGs) and copy number variations (CNVs) were identified between matched tumor-normal samples of PAAD. Significant CNVs were mapped onto enhancer coordinates using the genomic intersection functionality of BEDTools. By combining the gene expression and CNV data, we identified 169 genes whose expression shows a positive correlation with the CNV of enhancers. We further identified 16 genes which are regulated by a super enhancer and 15 genes which have high prognostic potential (Z-score > 1.96). Cox proportional hazard analysis of these genes indicates that these are better predictors of survival. Taken together, our integrative analytical approach identifies enhancer CNV-driven gene expression change in PAAD, which could lead to better understanding of PAAD pathogenesis and to the design of enhancer-based cancer treatment strategies.
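The step of matching CNVs onto enhancer coordinates amounts to a genomic interval intersection. The sketch below is not BEDTools itself; it is a plain Python illustration of the overlap test, assuming BED-style 0-based, half-open intervals. The function name `intersect` and all coordinates are hypothetical.

```python
from typing import List, Tuple

# (chromosome, start, end) with 0-based, half-open coordinates as in BED files.
Interval = Tuple[str, int, int]


def intersect(cnvs: List[Interval], enhancers: List[Interval]):
    """Return every (cnv, enhancer) pair that overlaps by at least one base."""
    hits = []
    for c_chrom, c_start, c_end in cnvs:
        for e_chrom, e_start, e_end in enhancers:
            # Half-open intervals overlap when each starts before the other ends.
            if c_chrom == e_chrom and c_start < e_end and e_start < c_end:
                hits.append(((c_chrom, c_start, c_end), (e_chrom, e_start, e_end)))
    return hits


# Hypothetical coordinates, for illustration only.
cnvs = [("chr1", 100, 200), ("chr2", 50, 60)]
enhancers = [("chr1", 150, 250), ("chr1", 200, 300), ("chr2", 10, 20)]
hits = intersect(cnvs, enhancers)
```

The nested loop is quadratic and only suitable for small inputs; dedicated tools use sorted or indexed intervals to handle genome-scale data.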
Searching haptens, carrier proteins, and anti-hapten antibodies
This article does not have an abstract
Not Available
The domestic cow, Bos taurus, is one of the important species selected by humans for various traits, viz. milk yield, meat quality, draft ability, and resistance to disease and pests, as well as for social and religious reasons. Since cattle were domesticated in the Neolithic (8,000-10,000 years ago), the population has reached 1.5 billion and is likely to reach 2.6 billion by 2050. These large numbers, together with breed management, the market need for traceability of breed products, conservation prioritization and IPR issues arising from germplasm flow/exchange, have created a critical need for accurate and rapid breed identification. Defined breed descriptors have long been used to identify breeds, but because phenotypic descriptions are lacking, especially for ova, semen, embryos and breed products, a molecular approach is indispensable. A molecular approach is likewise imperative for characterizing the degree of admixture and non-descript animals. To date, breed identification methods based on molecular data analysis have had major limitations, such as the lack of available reference data and the need for computational expertise. To overcome these challenges, we developed a web server that maintains reference data and provides a facility for breed identification. The reference data used to develop the prediction model were obtained from 8 cattle breeds and 18 microsatellite DNA markers, yielding 18000 allele data. In this study, various algorithms were used to reduce the number of loci or to identify important loci. The number of loci was reduced to 5 using a memory-based learning algorithm without compromising the accuracy of 95%. This model, approach and methodology can play an immense role in breed identification and conservation programmes for all domestic animal species across the globe. It can also be adapted to any flora and fauna to identify the variety or breed, as needed in germplasm management.
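Memory-based learning can be illustrated with a k-nearest-neighbour classifier over microsatellite genotypes. This is a minimal sketch under stated assumptions, not the authors' model: the distance (count of unshared alleles per locus), the value `k=3`, the locus names, allele sizes and breed labels are all hypothetical, and every genotype is assumed to be diploid and typed at the same loci.

```python
from collections import Counter


def genotype_distance(g1, g2):
    """Count alleles not shared between two diploid genotypes across loci.

    Each genotype maps a locus name to a pair of allele sizes.
    Both genotypes are assumed to be typed at the same loci.
    """
    distance = 0
    for locus in g1:
        a = Counter(g1[locus])
        b = Counter(g2[locus])
        shared = sum((a & b).values())  # multiset intersection of alleles
        distance += 2 - shared
    return distance


def predict_breed(query, reference, k=3):
    """Assign the majority breed among the k reference animals nearest to query."""
    nearest = sorted(reference, key=lambda rec: genotype_distance(query, rec[0]))[:k]
    votes = Counter(breed for _, breed in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical reference animals: (genotype, breed label).
reference = [
    ({"L1": (100, 102), "L2": (200, 200)}, "BreedA"),
    ({"L1": (100, 104), "L2": (200, 202)}, "BreedA"),
    ({"L1": (110, 112), "L2": (210, 212)}, "BreedB"),
    ({"L1": (110, 110), "L2": (210, 214)}, "BreedB"),
]
query = {"L1": (100, 102), "L2": (200, 202)}
predicted = predict_breed(query, reference)
```

Reducing the marker panel, as the abstract describes, would correspond to dropping loci from the genotype dictionaries and checking whether prediction accuracy on held-out animals is preserved.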